The 3' splice site of the influenza A
segment 7 transcript is utilized 
to produce mRNA for the critical M2 
ion-channel protein. In solution a 
63 nt fragment that includes this region 
can adopt two conformations: a pseu- 
doknot and a hairpin. In each confor- 
mation, the splice site, a binding site for 
the SF2/ASF exonic splicing enhancer 
and a polypyrimidine tract, each exists 
in a different structural context. The 
most dramatic difference occurs for 
the splice site. In the hairpin the splice 
site is between two residues that are 
involved in a 2 by 2 nucleotide internal 
loop. In the pseudoknot, however, these 
bases are canonically paired within one 
of the pseudoknotted helices. The con- 
formational switching observed in this 
region has implications for the regula- 
tion of splicing of the segment 7 mRNA. 
A measure of stability of the structures 
also shows interesting trends with 
respect to host specificity: avian strains 
tend to be the most stable, followed by 
swine and then human. 

Overview 

Influenza viruses are divided into three 
major clades: influenza A, B and C. Of 
these, influenza A and B are the most dan- 
gerous as they cause seasonal epidemics. 
There are an estimated three to five mil- 
lion serious infections yearly, resulting in 
approximately 500,000 deaths. 1 The eco- 
nomic cost of seasonal influenza is very 
steep. In the US alone the yearly cost is 
$10.4 billion in direct medical expenses 
and $16.3 billion in lost wages due to
illness or death. 2 Even more troubling
is the propensity of influenza A strains 
to cause pandemic outbreaks. Influenza 
A is one of the major killers of the 20th 
century. The Spanish Flu of 1918, caused 
by a pandemic H1N1 avian strain, killed 
at least 20 million people 3 and perhaps 
as many as 100 million. 4 A major factor 
in the rise of pandemics is the propen- 
sity of strains from different host species, 
e.g., human and avian, to combine and 
form a novel strain, to which humans are 
immunologically naive. 5,6 Re-assortment 
in influenza is possible because the virus 
has a segmented genome. There are eight 
negative-sense RNA segments that make 
up the influenza A genome. Viral RNAs 
(vRNAs) bind to multiple copies of virally 
encoded NP protein, and the heterotri- 
meric viral polymerase to form vRNPs, 
which are uniquely packaged into active 
virions. 

RNA structure plays an important role 
in the formation of the vRNP and in the 
replication of the vRNA. The 5' and 3' 
untranslated regions of the vRNAs have 
conserved complementary regions which 
allow for the formation of long range 
base pairs. 7-9 The association of these
ends forms the promoter necessary for the 
initiation of RNA synthesis 10 and RNA 
structure may influence which positive-sense
RNA [(+)RNA] is produced: 11-14
protein coding mRNA or template com- 
plementary RNA (cRNA) that is used for 
producing more vRNA. This interaction 
is stable under physiological conditions; 15 
the particular structure, however, remains 
controversial, with two competing models:
a stem-like "panhandle" structure 16,17
or a more complex "corkscrew" model. 18
Beyond this region there is little evidence
for additional structure in the vRNA.
Calculations of global folding free energy
indicate that the vRNA is much less stable
than the (+)RNA and shows little propensity
for maintaining RNA structure in
influenza A strains. 19

Figure 1. Cartoon of the influenza A segment 7 transcript indicating the M1 and M2 open reading
frames (ORFs) with orange and green, respectively. The 3' end of the M1 ORF lies within the M2
coding region and is indicated with a dashed line. The splice sites are indicated with red arrows.
Below are the two secondary structures of the 3' splice site region where the branch point,
polypyrimidine tract, and SF2/ASF exonic splicing enhancer site are indicated with purple, blue
and orange, respectively.

In contrast to the vRNA, the influ- 
enza A (+)RNA shows evidence for stable, 
global RNA structure in four of the eight 
viral segments: segments 1, 5, 7 and 8. 19
Additionally, a survey for local RNA struc- 
ture revealed 20 regions where the (+)RNA 
showed a propensity for forming unusually 
stable and conserved predicted RNA struc- 
ture. 20 Several sites also occurred in regions 
with suppressed third codon variability, 
which indicates the possible constraint of 
maintaining RNA structure. 21 Five regions 
of special interest occurred within or near
functional sites, showed suppressed
codon evolution and were within or near
predicted structured regions. 20 These
sites occurred in segments 8, 7 and 2. 



The best characterized of these pre- 
dicted regions occurs at the 3' splice site 
in the segment 7 transcript. Segment 7 
encodes the M1 matrix protein and the
alternatively spliced M2 ion channel pro- 
tein as well as the less well-understood 
M3 polypeptide and M4 protein. 22 The 
conserved structural region encompasses 
the 63 nt surrounding the splice site and, 
in addition to the splice site, contains 
multiple splicing signals (Fig. 1): a binding
site for the SF2/ASF exonic splicing
enhancer, 23 a polypyrimidine tract and a
putative branch point signal. This region
folds as both a pseudoknot and a hairpin,
and each conformation places these splicing
signals in different structural contexts
(Fig. 1). 24 When folded in the presence
of Mg2+ the two conformations are populated
at a ratio of roughly 50/50, which implies a similar
free energy of folding for each structure.
Such a delicate equilibrium may be eas- 
ily influenced by changes in the cellular 
environment or the effects of proteins. A
similar, but structurally distinct, conformational
switch, from hairpin to pseudoknot,
was discovered in the 3' splice site
of the influenza segment 8 transcript. 25 
The structures discovered in segment 8 
comprise a distinct family of structured 
RNAs conserved between influenza A and 
B. 26 Segment 7, in contrast, is not spliced 
in influenza B, and the pseudoknot and
hairpin are not predicted to form in
influenza B. 
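
The link drawn above between a roughly 50/50 population ratio and similar folding free energies follows from the standard two-state equilibrium relation. The following is a sketch only, assuming the two conformations are in equilibrium; the symbols K, R and T are introduced here for illustration and do not appear in the original text:

```latex
K = \frac{[\mathrm{pseudoknot}]}{[\mathrm{hairpin}]}
  = \exp\!\left(-\frac{\Delta G^{\circ}_{\mathrm{pseudoknot}} - \Delta G^{\circ}_{\mathrm{hairpin}}}{RT}\right),
\qquad
K \approx 1 \;\Rightarrow\;
\Delta G^{\circ}_{\mathrm{pseudoknot}} \approx \Delta G^{\circ}_{\mathrm{hairpin}} .
```

For scale, at 37°C RT is about 0.62 kcal/mol, so a shift from 50/50 to 60/40 would correspond to a free energy difference of only about RT ln(1.5), or roughly 0.25 kcal/mol. This is consistent with the idea that such an equilibrium could be shifted by modest changes in the cellular environment.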

RNA secondary structure is known to 
play important roles in alternative splicing. 27
In particular, hiding or revealing
splice sites 28-31 and protein binding sites 32-34
are mechanisms used in nature to regulate
the splicing of mRNA. The splicing
of the M2 mRNA of influenza A is timed 
to produce this protein late in viral infec- 
tion, and this product is roughly 5% as 
abundant as the M1 transcript. 35 Thus,
RNA conformational switching could be 
involved in the regulation of the amount 
and timing of splicing. This raises the 
possibility of specifically targeting either 
or both conformations of the 3' splice 
site of segment 7 to modulate biological
activity or for the application of
oligonucleotide 36-38 or small-molecule 39,40
therapeutics. For example, the hairpin 
and pseudoknot structures were probed 
with a library of 861 unique pentamer and 
hexamer oligonucleotides. The observed 
binding pattern was unique and specific to 
each conformation. 24 

RNA Structures 

The hairpin. The hairpin conformation 
and structurally relevant mutations of the 
3' splice site are shown in Figure 2A. P1
and P2 are common structures between
the hairpin and pseudoknot. P1, which
contains the polypyrimidine tract and 
putative branch point, is less stable and 
more accessible to enzymes in the hairpin 
conformation. 24 P2 and P3 are separated 
by a 2 by 2 nucleotide internal loop and 
the splice site occurs within the 5' side 
of this loop. The nucleotides in this loop 
likely form non-canonical GA and GG 
pairing interactions. 24 Similar GA/GG 
loops are observed in the HIV-1 Rev binding
domain 41 and the ribosomal loop-E
motif 42 and are important for protein
recognition. The P3 hairpin loop is composed
of a hexamer terminal loop and a six base
pair stem, which is interrupted by a single
bulged A at nt 730. Most of the terminal
loop and the 3' half of P3 comprise the key
residues for binding the SF2/ASF exonic
splicing enhancer protein 23 (Fig. 1). This
purine-rich binding site extends into the 3'
side of the 2 by 2 nt internal loop and two
nts of the P2 stem (Fig. 1).

1306 RNA Biology Volume 9 Issue 11

Table 1. Canonical and non-canonical base pair composition for helices of the pseudoknot, top, and P3' of the hairpin conformation, bottom. Single
point mutations that would preserve canonical pairing are in green, while double point mutations that would preserve pairing are in blue. Areas where
the most frequent pairing is expected to be non-canonical are indicated with italics. Numbers are from all available influenza A sequences (NCBI).

The hairpin structure is well conserved. 
Canonical base pairing is, on average, 
96.8% conserved (Table 1). Each helix 
is supported by at least one compensatory 
(double point mutation that preserves base 
pairing) or consistent mutation (single 
point mutation that preserves canonical 
pairing; Fig. 2A). 
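
The distinction between consistent and compensatory mutations can be made concrete with a small classifier. This is a minimal sketch, not the procedure used to build Table 1: the function name `classify_pair_change` is hypothetical, and it assumes that Watson-Crick pairs and GU wobble pairs are the pairs counted as canonical.

```python
# Assumption: Watson-Crick and GU wobble pairs count as canonical.
CANONICAL_PAIRS = frozenset({"AU", "UA", "GC", "CG", "GU", "UG"})


def classify_pair_change(reference, variant):
    """Classify a variant base pair relative to a reference base pair.

    Both arguments are two-letter strings, one letter per pairing partner.
    """
    ref, var = reference.upper(), variant.upper()
    if len(ref) != 2 or len(var) != 2:
        raise ValueError("pairs must be two-letter strings")
    # Count how many of the two positions differ from the reference.
    changed = sum(r != v for r, v in zip(ref, var))
    if changed == 0:
        return "unchanged"
    if var not in CANONICAL_PAIRS:
        return "non-canonical"
    # Pairing is preserved: one change is consistent, two is compensatory.
    return "consistent" if changed == 1 else "compensatory"


# Illustrative calls for an AU reference pair.
for variant in ("GU", "GC", "AC"):
    print("AU ->", variant, classify_pair_change("AU", variant))
```

Under these assumptions, a change from AU to GU counts as consistent, AU to GC as compensatory, and AU to AC as a loss of canonical pairing.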

The pseudoknot. In the pseudoknot, 
nucleotides that make up P3 in the
hairpin conformation are re-arranged to
make P3', and nucleotides 714 to 717 are 
paired in the P0 helix (Fig. 2B). In this
structure the 3' splice site is canonically
paired in the middle of the P0 helix. In
addition to four canonical pairs, P3' has
several non-canonical pairing possibilities
(Fig. 2B). The loop of P3' may contain
three consecutive GA pairs. Continuous
stretches of three purine-purine pairs (the
3Rs motif) are especially stabilizing in
internal loops. 43

P0 and P3' are both well conserved,
with canonical pairing 99.0 and 100.0%
conserved, respectively (Table 1). There
are five consistent mutations in P0, but
only a single change at C720 in P3'.
Interestingly, when mutations occur 
within the loop of P3', they always lead
to purines at these positions (Table 1).
This observation supports the potential 
formation of a structure similar to the 3Rs 
motif. 

Evolutionary constraints and species 
distribution. The 3' splice site structured 
region was initially discovered by identi- 
fying parts of the coding RNA with con- 
strained evolution of synonymous sites. 20 
This implies a need to maintain RNA 
secondary structure in addition to protein 
sequence, 21 which reduces the allowable 
synonymous site substitutions. Indeed, 
the sequence in this structured region has 
multiple constraints on its evolution: in 
addition to encoding both the hairpin and 
pseudoknot structures, it must maintain 
the M1 open reading frame (ORF) and,
after nt 714, the M2 ORF, an SF2/ASF
protein binding site, and the polypyrimidine
and branch point sequences (Fig. 1).
These constraints explain the small number
of double point mutations (compensatory
changes) that preserve canonical base
pairing compared with single point mutations
(consistent changes; Table 1). For
example, residues 689 and 702 are most
often an AU pair (Fig. 2B), but mutate to
GU and AC pairs with much higher frequency
than the GC double point mutation.
In no case do compensatory changes
outnumber single mutations and many
single mutations resulted in non-canonical
pairs. Mutations from canonical to
non-canonical pairs resulted primarily in
CA pairs followed by GA pairs (Table 1).
CA and GA pairs are commonly observed
non-canonical pairs that are able to maintain
helicity in RNA structures; 44-46
they also play important roles in molecular
recognition. 47,48 Thus, it may be hard
to make double point mutations that do
not alter the protein coding sequence, as
synonymous sites are rarely paired in these
structures.

Figure 2. Consensus secondary structures of the hairpin (A) and pseudoknot (B) conformations. Structurally relevant mutations that occurred with a
frequency greater than two in the alignment of all available sequences are annotated: consistent mutations, single point mutations that preserve pairing,
are indicated with green, while compensatory mutations, double point mutations that preserve pairing, are indicated in blue.

An interesting trend was observed 
when all available influenza A sequences 
for the 3' splice site structure were sorted
by expected stability. Sequences with
the highest fraction of canonical and
GC pairs, which are expected to stabilize
structure, were overwhelmingly
from avian specific strains. 24 As
the fraction of canonical and GC pairs 
decreased, the percentage of avian spe- 
cific strains decreased. Sequences with 
the smallest fraction of canonical and GC 
pairs were mainly from human specific 
strains with the percentage decreasing 
with increasing canonical and GC pairing 
potential. Swine specific strains tended to 
fall in between. Interestingly, no signifi- 
cant difference was observed in the stabil- 
ity of hairpin vs. pseudoknot in terms of 
host-specific structural stability. 23 Perhaps 
the ratio of each conformation needs to be 
maintained irrespective of host. 
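
The sorting described above can be sketched in a few lines of code. The text does not specify how the canonical and GC fractions were combined, so the ordering used here (canonical fraction first, GC fraction as a tie-breaker) is an assumption, as are the function names and the treatment of GU wobble pairs as canonical. The input records are invented toy data, not influenza sequences.

```python
# Assumption: Watson-Crick and GU wobble pairs count as canonical.
CANONICAL_PAIRS = frozenset({"AU", "UA", "GC", "CG", "GU", "UG"})
GC_PAIRS = frozenset({"GC", "CG"})


def pairing_fractions(pairs):
    """Return (canonical fraction, GC fraction) for a list of base pairs.

    Each pair is a two-letter string such as "GC", one letter per partner.
    """
    if not pairs:
        raise ValueError("at least one base pair is required")
    normalized = [pair.upper() for pair in pairs]
    canonical = sum(pair in CANONICAL_PAIRS for pair in normalized)
    gc = sum(pair in GC_PAIRS for pair in normalized)
    return canonical / len(normalized), gc / len(normalized)


def rank_by_expected_stability(records):
    """Sort (name, pairs) records from most to least expected stability.

    Hypothetical scoring: canonical fraction first, GC fraction second.
    """
    return sorted(records, key=lambda rec: pairing_fractions(rec[1]), reverse=True)


# Toy input, invented for illustration only (not real influenza data).
toy_records = [
    ("seq_a", ["AU", "GU", "CA", "GC"]),
    ("seq_b", ["GC", "GC", "AU", "CG"]),
    ("seq_c", ["GC", "AU", "GU", "UA"]),
]
ranked_names = [name for name, _ in rank_by_expected_stability(toy_records)]
print(ranked_names)
```

With real data, each record would hold the bases found at the paired positions of one sequence in the alignment, and the host annotation of each sequence could then be compared against its rank.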

This trend in host-specific structural 
stability was explored in the context of 
whole coding region folding thermody- 
namics and was found to be a general 
phenomenon in influenza A. 19 In general,
avian sequences are more stable
than human sequences and swine specific 
strains fall in between. This trend paral- 
lels the temperature at which the virus 
must replicate. The human respiratory tract,
swine respiratory tract and avian gut are at 33,
37 and 42°C, respectively. 49 Perhaps the
RNA structure is optimized to perform its 
function at each temperature. 

Concluding Remarks 

A new structured RNA family has been 
characterized in influenza A. This family 
joins the structured splice site from seg- 
ment 8 in the growing collection of known 
influenza RNA structures. 26 Both of these 
structured regions are proposed to influ- 
ence splicing of their respective mRNAs. 
Conformational switching places sites that 
are functionally important for splicing in 
different structural contexts. In particular, 
these sites are expected to be more acces- 
sible in the hairpin conformation than in 
the pseudoknot. Switching between hair- 
pin and pseudoknot may be a conserved 
mechanism for modulation of splicing in 
influenza. Such a switch makes an attrac- 
tive target for RNA therapeutics as either 
structure may be specifically targeted with 
small molecules or oligonucleotides to 
inhibit the virus. 

A seed alignment for this structured 
region, created by collapsing alignments 
to only include non-redundant sequences,
has been submitted to the Rfam database. 

Supplemental Material 

Supplemental material may be found here: 
www.landesbioscience.com/journals/rnabiology/article/22343